Customer product reviews now play an essential role in online purchasing
decisions. Before buying, a customer typically looks for the opinions of other
customers in product reviews, blogs, social networking sites, and similar
sources. Most product reviews are lengthy, and only a few users state their
opinion clearly, so it is difficult to analyze and understand the meaning of
reviews. To improve user satisfaction and the shopping experience, it has
become common practice for online sellers to let their users review or comment
on the products they have sold. The major goal of the paper is to solve the
feature extraction problem and the opinion classification problem for customer
product reviews by extracting the feature words and opinion words from those
reviews. An Efficient Feature Extraction and Classification (EFEC) algorithm
is proposed and implemented to extract features from opinion words. A reviewer
usually marks both positive and negative aspects of the reviewed product, even
though the general opinion of the product may be positive or negative. The
EFEC algorithm is used to predict the number of positive and negative opinions
in reviews. In the experimental evaluation, the proposed algorithm improves
accuracy by 15.05%, precision by 13.7%, recall by 15.59%, and F-measure by
15.07% compared with existing methodologies.
1. INTRODUCTION

Nowadays, Internet usage is growing rapidly across a wide variety of fields. With the rapid
development of e-commerce, more and more products are bought on the web, and most people now shop
online. People also tend to share their experiences with products on the Internet. To improve user
satisfaction and the shopping experience, it has become common practice for online sellers to let their
users review or comment on the products they have sold. As a result, the number of reviews posted on the
Internet is increasing at a rapid rate, and some products have a very large number of reviews.
Understanding users' opinions about products is very useful both for sellers and for users who intend to
purchase those products in the future.

Opinion mining is the process of extracting opinion words and their targets for a specific product.
More generally, it is the task of discovering the opinions of different people about different targets.
It is an interdisciplinary research area that draws on fields such as natural language processing and
text mining and analysis to process and retrieve useful subjective data. However, the automatic
detection and analysis of opinions about products, brands, political issues, and so forth is a
demanding task. Opinion mining includes three chief components: features and feature-of relations;
opinion expressions and their related attributes (e.g., polarity); and feature-opinion relations. An
opinion lexicon is a list of opinion expressions, or a set of adjectives, used to express positive,
negative, or neutral opinions. Analyzing all reviews is not efficient when the number of reviews is
huge, and the wording of reviews is sometimes confusing. Most product reviews are lengthy, and only a
few users state their opinion clearly, so it is difficult to analyze and understand the meaning of
reviews. In earlier work, customer review sentences were tagged, opinion words were extracted, and
opinion orientations were recognized using the semantic orientation of the opinion words. Positive and
negative opinions were identified, but no ranking was applied to the products, although a ranking is
very useful for users or merchants when buying products. With document-level and sentence-level opinion
mining of online user reviews, firms cannot determine exactly what people liked and did not like.
Hence, current opinion mining research focuses on phrase-level opinion mining, which performs a
fine-grained examination and looks directly at the opinions in online reviews. However, social media
sites report customer opinions of products in various formats, which makes it difficult to find
customer reviews and monitor product opinion.

Shoiab Ahmed et al. [1] designed a sentiment analysis and opinion mining approach based on
SentiWordNet that counts scored words in seven classifications using machine learning algorithms. The
web data is gathered with a web crawler and passed through several preprocessing steps: stop words are
removed from the online reviews, stemming is performed with the Porter Stemmer algorithm, and the
reviews are then tagged with the Stanford POS tagger. Rowida Alfrjani et al. [2] introduced a new
technique for semantic modeling of domain knowledge for opinion mining. It focused on modeling the
domain knowledge in such a way that it can be translated into a formal ontology, which can then be
automatically enriched with ground facts acquired from public Linked Open Data resources.

Nitu Kumari et al. [3] reviewed various approaches to sentiment analysis, mainly machine learning
approaches. E-commerce is not simply purchasing and selling over the Internet; rather, it enhances the
effectiveness of different rival giants in the market. Opinion mining can capture opinions about a
product on the Internet through ratings, star systems, and product reviews. Dhanalakshmi et al. [4]
designed an opinion mining technique for classifying student feedback obtained during the module
assessment survey that is conducted each semester to collect students' feedback on different aspects of
teaching and learning. Opinion mining is performed on the student feedback collected through the
surveys using supervised machine learning algorithms implemented in RapidMiner. Balahadia et al. [5]
developed a teacher performance assessment tool using opinion mining with sentiment analysis. It
derived the sentiment score from the qualitative data and the numerical response rating from the
quantitative data of the teacher's assessment.

Solai Ananth et al. [6] designed a sentence-level categorizer that gathers datasets from Twitter.
The datasets are tokenized by a tokenizer, and the tokens are then handled by data preprocessing. A
Naive Bayes classifier classifies the datasets; it is reported to be the most efficient classifier for
sentence-level categorization. Deshmukh et al. [7] developed a bipartite graph clustering approach to
reduce the mismatch between the domain-specific words of the source and target domains.
Domain-independent words were used to cluster domain-specific words from the source and target
domains. Clustering was used to train a classifier for the target domain because it reduced the gap
between the domain-specific words of different domains. Marrese-Taylor et al. [8] examined
replicability issues in syntax-centric aspect-based opinion mining. The work focused on syntactic
techniques, which tend to show a lower level of transparency because of increasing model complexity and
the lack of code accessibility. Li et al. [9] described possible directions for deeper understanding,
helping to bridge the gap between psychology/cognitive science and computational methodologies. The
work focused on the opinion holder's basic needs and resultant goals, and a functional model of
sentiment provided the basis for explaining why a sentiment valence is held. Chandre et al. [10]
focused on creating reversed reviews to help supervised sentiment classification, which provides
insight into sentiment analysis and opinion mining. A corpus-based pseudo-antonym dictionary offered a
comprehensive, practical approach when lexical resources and domain knowledge are limited.

Kumar et al. [11] aimed to extract the maximum number of accurate product features from a huge
number of online product reviews. Their comprehensive feature extraction approach outperformed
individual methods for extracting product features in a semantic environment. Jain et al. [12]
developed a novel localized opinion mining model based on common-sense data extracted from the
ConceptNet ontology. The technique allowed information extracted from the social networking site
Twitter to be interpreted and used to identify public opinions. It is used to calculate a senti-score
and build a machine learning model that classifies client opinions. Arya et al. [13] presented a review
of opinion mining and sentiment analysis, the process of examining content (an opinion or review) about
a subject written in natural language and classifying it as positive, negative, or neutral based on the
human sentiment expressed in it. Mandal et al. [14] discussed a novel dictionary-based algorithm that
uses a lexicon-based approach for opinion mining and computes opinion polarity levels. The
lexicon-based approach to text classification for opinion mining uses a dictionary of words that convey
sentiment. Mane et al. [15] designed a new approach for opinion mining of Amazon reviews. The system
removes fake reviews and performs opinion mining on genuine reviews to rate the items. The Apache Spark
framework was used to increase the speed of data processing so that the analysis takes less time.

Patel et al. [16] presented a comparative study of various methodologies for multilingual sentiment
analysis. These methodologies fall into two groups: one classifies content without language
translation, and the other translates the test data into a target language before classification. Wu
et al. [17] focused on social choice theory and a collective decision-making model; the integrated
strategy supports the collective decision-making process based on an analysis of people's social roles
and improves its effectiveness. Ashok Kumar et al. [18] proposed a multi-aspect opinion mining
framework for open and distance education to measure public satisfaction. The framework covers data
gathering, data preprocessing, feature extraction, opinion discovery at the title, report, sentence,
and aspect levels, opinion perception, and opinion classification. Clavel et al. [19] introduced
different avenues for integrating sentiment analysis into face-to-face human-agent interactions.
Suitable psycho-linguistic models are used to describe effective human-agent dialogues. Semantic rules
and machine learning techniques integrate the multimodal nature of sentiment-related phenomena, the
variability of temporal and decision frames, the differing levels of complexity required by the timing
constraints of the interaction, and the heterogeneity. Pham et al. [20] developed a new
neural-network-based model that uses both known aspect ratings and the overall ratings of reviews to
determine the overall aspect weights. The overall rating of a review is derived from the aspect
ratings. Prakash et al. [22] implemented an automated system called Filtered Wall (FW) that filters
unwanted content from the content of online social network (OSN) users. The goal is to use an efficient
classification procedure to avoid being overwhelmed by useless messages. In OSNs, content filtering can
also be used for a different, more reactive purpose.

The work in [23] describes the integration of the Adaptive Weight Ranking Policy (AWRP) with
intelligent classifiers (NB-AWRP-DA and J48-AWRP-DA) via a dynamic aging factor to improve the
classifiers' predictive power. The methods are used to choose the best subset of features. The work in
[24] introduces a new framework, a fuzzy-based contextual recommendation system, for classifying
customer reviews. It extracts information from the reviews based on the context given by users. The
study in [25] identifies the best classifiers for class-imbalanced health datasets through a cost-based
comparison of classifier performance. The unequal misclassification costs were represented in a cost
matrix, and cost-benefit.

To overcome the above problems, an Efficient Feature Extraction and Classification (EFEC)
algorithm is proposed to solve the opinion classification and feature extraction problems for customer
product reviews; it extracts the feature words and opinion words from a product review dataset. The
technique is well suited to product review reports. Extracted rules are applied in an evaluation
process to verify whether they are appropriate. The best rules are then applied to the pre-processed
dataset to extract features from opinion words. A reviewer usually marks both positive and negative
aspects of the reviewed product, even though the general opinion of the product may be positive or
negative. The EFEC algorithm is used to predict the number of positive and negative opinions in
reviews. The positive and negative labels are gathered from opinion words. Review comments typically
consist of long sentences. From these sentences the system extracts words such as excellent and good
for positive opinions and poor and bad for negative opinions. The subsequent stage is to identify the
number of positive and negative opinions for each extracted part. The contributions of the paper are
as follows:

• To design an efficient product opinion mining algorithm that more accurately finds the product
features that reviewers have commented on.

• To offer a robust mechanism for solving the product feature extraction problem in customer
reviews using the Efficient Feature Extraction and Classification (EFEC) algorithm.

• To find the number of positive and negative opinions in reviews from opinion words using the
EFEC methodology.

• To exclude fake reviews from product ratings by applying an effective technique that identifies
fake online reviews and filters them out.

• To reduce product retrieval time and improve product review classification accuracy and success
rate compared with existing techniques.
The rest of the paper is organized as follows. The literature most closely related to the
proposed methodology has been reviewed above in this section. Section 2 explains the proposed
methodology, the system architecture, and the implementation pre-processing steps, along with the
algorithm pseudocode. Section 3 discusses the programming setup and the performance evaluation metrics,
with a comparative analysis of results. Section 4 summarizes the overall work and future directions.


2. PROPOSED METHOD

This section gives an overview of the proposed methodology, the system architecture, and an
explanation of the algorithm with pseudocode. The proposed technique addresses the complexity of
implementing product feature extraction from customer reviews in opinion mining. Feature-extraction-based
opinion mining methods already exist; however, they do not meet the current requirements of opinion
mining applications. The experimental study presented here aims to provide an efficient method that
closes the gap between current methods and the requirements of opinion mining applications.


2.1. System Architecture

This section presents the system architecture together with details of the proposed techniques and
algorithm. Figure 1 shows the design architecture and the step-by-step implementation process. The
objective of the proposed system is to find accurate product features that customers have commented
on. The system also supports interaction between customer and administrator and applies the Efficient
Feature Extraction and Classification (EFEC) method to solve the product feature extraction problem for
customer reviews. The proposed EFEC algorithm classifies positive and negative opinions in reviews
using opinion words, reduces product retrieval time, and improves product review classification
accuracy and success rate.

Figure 1. System Architecture Diagram


2.2. Implementation Pre-processing Steps
2.2.1. Authentication

The customer has to provide his or her full details to create a new account. Only after the
account has been created successfully can the customer use the online shopping facility. Once the
customer submits the details, the information is recorded, and the customer can log in with a customer
ID and password to purchase products. This module verifies that the customer accessing the system is
authorized and does not allow other customers access.


2.2.2. Purchasing

The customer can purchase products and can also submit reviews with suggestions. The proprietor
can record product features (product name, cost, validity, etc.) by category, such as mobiles,
computers, laptops, and cars, and maintain the product details. The customer can choose from the
products displayed on the first page or search for a product by keyword or by type. The customer can
buy the product using a credit or debit card. To purchase, the customer must provide the following
details: credit card number, cardholder name, date of birth, and credit card provider. The card is
then authenticated; if the card details are legitimate, the customer is permitted to purchase the
product.


2.2.3. Feature Extraction

When extracting and examining opinions from product reviews, it is not sufficient to obtain only
the overall opinion about a product. In most cases, customers expect to find opinions about individual
features of the product under analysis. Readers want to know which of the reviewer's opinions of the
product are positive and which are negative, not just the reviewer's general opinion. To meet that
aim, both product features and opinion words must be recognized, which is done by applying the
Efficient Feature Extraction and Classification (EFEC) algorithm. First, however, it is important to
extract and build a product feature list and an opinion word lexicon, both of which provide prior
knowledge that is helpful for opinion mining.
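
As an illustration of how a product feature list and an opinion word lexicon might be built, the
following Python sketch keeps words that occur in several reviews as candidate features. It is a
minimal sketch under stated assumptions, not the EFEC implementation: the seed lexicon contains only
the example opinion words named in this paper, the stop-word list is a small hypothetical one, and
document frequency stands in for whatever extraction rules the system actually applies. The names
`OPINION_LEXICON`, `STOP_WORDS`, and `candidate_features` are introduced here for illustration only.

```python
import re
from collections import Counter

# Seed opinion lexicon: only the example words mentioned in the paper.
OPINION_LEXICON = {
    "excellent": "positive",
    "good": "positive",
    "poor": "negative",
    "bad": "negative",
    "disappointing": "negative",
}

# Hypothetical, deliberately tiny stop-word list (an assumption).
STOP_WORDS = {"the", "is", "a", "an", "and", "but", "of", "it", "this", "was"}


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_features(reviews, min_reviews=2):
    """Return words found in at least `min_reviews` reviews that are
    neither stop words nor opinion words, sorted alphabetically."""
    doc_freq = Counter()
    for review in reviews:
        words = set(tokenize(review)) - STOP_WORDS - set(OPINION_LEXICON)
        doc_freq.update(words)  # each word counted once per review
    return sorted(w for w, n in doc_freq.items() if n >= min_reviews)


if __name__ == "__main__":
    sample = [
        "The battery is excellent but the screen is poor",
        "Good battery and a bad camera",
        "The screen is good",
    ]
    print(candidate_features(sample))
```

With these three sample reviews, "battery" and "screen" each occur in two reviews and are kept, while
"camera" occurs only once and is dropped. A part-of-speech filter for nouns and noun phrases could
replace the frequency threshold.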


2.2.4. Rating

The customer can give a review of the service provider in the form of ratings. Customer ratings
are considered an essential feature because they play a vital part in product purchases. Wrong or
unfair ratings may lead to serious problems in many systems. This module therefore collects the
customer reviews and secures them.


2.2.5. Classification

All customer profile values and reviews are gathered. Customer profile values also include time,
duration, reviews, and so forth. All customer profiles, including rating values, are stored securely.
All of the gathered information is used as the dataset. In the dataset, positive and negative opinions
are classified by the number of reviews provided, using the SVM algorithm.


2.2.6. Positive and Negative Reviews

This module provides the framework through which a customer of the portal can give positive and
negative reviews of a product he or she has purchased, and through which the administrator can view
the list of reviews.


2.3. Efficient Feature Extraction and Classification (EFEC) Algorithm

To discover the various product features, the EFEC algorithm is used to enhance feature extraction
and the classification of positive and negative opinions. In the framework, a product set is a set of
words or a phrase that occur together. The method assumes that features that appear in many opinions
are more likely to be related and, consequently, more likely to be real product features. The
classification algorithm uses opinion words to find the number of positive, negative, or neutral
opinions in product reviews. Opinion words that express desirable states, such as excellent and good,
carry a positive opinion, while opinion words that express undesirable states, such as poor, bad, and
disappointing, carry a negative opinion. A sentence that includes opinion words is taken as an opinion
sentence. The goal is to recognize the number of positive, negative, or neutral opinions for every
extracted product feature. These opinion words are extracted using EFEC.

EFEC is a classification algorithm used for text classification. The product review text to be
categorized is converted into words. EFEC builds a hyperplane from these words that separates the data
examples of one class from those of another. A distinctive property of EFEC is that it can still learn
when given large amounts of data. It works well for text classification because it can handle a huge
number of features. EFEC is also robust when there is a small set of examples spread over a large
region. It has provided reliable outcomes in opinion mining research. Product review categorization is
a two-stage procedure. In the first stage, a categorization algorithm constructs the classifier by
"learning from" a training set made up of the corpus and its associated class labels. In the second
stage, the model is used for categorization. A separate set, called the test set, is used to estimate
the accuracy of the constructed model.
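
The two-stage procedure can be illustrated with a small Python sketch. The paper does not specify how
EFEC constructs its hyperplane, so the sketch swaps in a plain perceptron over bag-of-words features as
a stand-in linear classifier; it is not the EFEC algorithm. The function names and the toy training
and test sets are assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train_perceptron(training_set, epochs=10):
    """Stage 1: learn one weight per word plus a bias from
    (text, label) pairs, where label is +1 or -1."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for text, label in training_set:
            words = tokenize(text)
            score = bias + sum(weights[w] for w in words)
            if label * score <= 0:  # wrong side of (or on) the hyperplane
                for w in words:
                    weights[w] += label
                bias += label
    return weights, bias


def predict(model, text):
    """Stage 2: classify a review as +1 (positive) or -1 (negative)."""
    weights, bias = model
    score = bias + sum(weights.get(w, 0.0) for w in tokenize(text))
    return 1 if score > 0 else -1


def accuracy(model, test_set):
    """Fraction of held-out reviews whose predicted label is correct."""
    correct = sum(predict(model, text) == label for text, label in test_set)
    return correct / len(test_set)


# Toy labelled data (hypothetical, for illustration only).
TRAINING_SET = [
    ("excellent phone good battery", 1),
    ("poor screen bad camera", -1),
    ("good camera", 1),
    ("bad battery disappointing", -1),
]
TEST_SET = [("excellent camera", 1), ("poor battery", -1)]

if __name__ == "__main__":
    model = train_perceptron(TRAINING_SET)
    print(accuracy(model, TEST_SET))
```

The learned weights and bias define the separating hyperplane: words seen mostly in positive reviews
receive positive weights and words seen mostly in negative reviews receive negative ones. The test set
is kept apart from training so that the accuracy estimate reflects unseen reviews.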
Input: Customer review dataset DB
Output: Tabular result TR with quality of information (accuracy, precision, recall, and F-measure)

Begin
    User authentication process
    Purchase any product
    Post feedback as a product review
    Store the review in the centralized database
    Apply the EFEC algorithm
        Collect opinion words in a container of keywords
        Extract the features
        If feature is positive
            Predict positive feature
        Else
            Predict negative feature
        End If
        Compare opinions from the feature dataset and the reviews
        If a negative word appears in the feature
            Predict negative opinion word
        Else
            Predict positive opinion word
        End If
    Predict positive and negative opinions
    Retrieve the result and visualize the data
    Calculate accuracy, precision, recall, and F-measure
End
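
The core steps of the listing (collect opinion words, extract features, predict a polarity for each
feature, and count the results) can be sketched in Python. This is an illustrative sketch under
simplifying assumptions, not the authors' Java implementation: the opinion word sets contain only the
example words from this paper, the feature set is supplied by the caller, and each review is split into
clauses at punctuation and at the word "but" so that an opinion word is paired with the feature in the
same clause. The function names are hypothetical.

```python
import re
from collections import Counter

# Example opinion words taken from the paper's own illustrations.
POSITIVE_WORDS = {"excellent", "good"}
NEGATIVE_WORDS = {"poor", "bad", "disappointing"}


def split_clauses(review):
    """Split a review into clauses at sentence punctuation and at 'but'."""
    parts = re.split(r"[.!?;]|\bbut\b", review.lower())
    return [p for p in parts if p.strip()]


def classify_review(review, features):
    """Map each known feature mentioned in an opinion clause to
    'positive' or 'negative'."""
    result = {}
    for clause in split_clauses(review):
        words = set(re.findall(r"[a-z]+", clause))
        if not words & (POSITIVE_WORDS | NEGATIVE_WORDS):
            continue  # no opinion word: not an opinion clause
        # A negative word in the clause decides the polarity, as in the listing.
        polarity = "negative" if words & NEGATIVE_WORDS else "positive"
        for feature in features & words:
            result[feature] = polarity
    return result


def count_opinions(reviews, features):
    """Count positive and negative feature opinions over all reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(classify_review(review, features).values())
    return counts


if __name__ == "__main__":
    features = {"battery", "screen", "camera"}
    reviews = [
        "The battery is excellent but the screen is poor.",
        "Good camera. Bad battery.",
    ]
    print(count_opinions(reviews, features))
```

Clause splitting matters here because, as noted earlier, a reviewer usually marks both positive and
negative aspects in the same review; without it, a single negative word would label every feature in
the sentence as negative.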


3. RESULTS AND ANALYSIS
3.1. Programming Setup

To compare the proposed system with existing methodologies, the deployment was performed on a
system with an Intel Core i7 7600 processor, 83GB memory, and Windows 7. The method was implemented in
Java using NetBeans 8.0 with Apache Tomcat 8.0.3 and a MySQL 5.5 database. The proposed algorithm was
evaluated on several kinds of opinion datasets to estimate its effectiveness.


3.2. Dataset

The proposed methodology uses Customer Review Datasets (CRD), which contain product reviews. A
product review is a personal text consisting of a sequence of words that express the reviewer's
opinions about a particular product. Product review text may include complete sentences, short
remarks, or both. Product reviews are gathered from websites such as www.amazon.com, www.epinions.com,
and www.cnet.com. Every product review on these websites carries a rating (for example, 0-5 stars), a
review date, a reviewer name and location, the name of the manufactured product, and the review
content.
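
The fields of a review record listed above could be represented by a small data structure such as the
following Python sketch. The class and field names are hypothetical, and treating the star rating as
an integer between 0 and 5 is an assumption; the paper does not describe the storage schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ProductReview:
    """One review record with the fields described in the dataset section."""
    product_name: str
    reviewer_name: str
    reviewer_location: str
    review_date: date
    rating: int   # star rating, assumed to be an integer from 0 to 5
    content: str  # free text: complete sentences, short remarks, or both

    def __post_init__(self):
        # Reject ratings outside the 0-5 star range.
        if not 0 <= self.rating <= 5:
            raise ValueError(f"rating must be between 0 and 5, got {self.rating}")


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    review = ProductReview(
        product_name="Example Phone",
        reviewer_name="Reviewer A",
        reviewer_location="Unknown",
        review_date=date(2018, 1, 15),
        rating=4,
        content="Good battery. Poor screen.",
    )
    print(review.rating, review.content)
```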


3.3. Performance Evaluation Metrics

The proposed methodology is assessed with the evaluation metrics accuracy, precision, recall, and
F-measure to measure its effectiveness and to compare it with previous mechanisms in opinion mining,
where it enhances feature extraction and opinion classification. These evaluation parameters are
explained in detail below.


3.3.1. Accuracy

Accuracy is defined as the number of correct predictions (true positives plus true negatives)
divided by the total number of predictions. True positives and true negatives are the numbers of
products correctly predicted as positive and negative, respectively. False positives and false
negatives are the numbers of products incorrectly predicted as positive and negative, respectively.

$$\text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{True Positives} + \text{True Negatives} + \text{False Positives} + \text{False Negatives}}$$


3.3.2. Precision
Precision is the number of products correctly predicted as positive divided by the total number
of products predicted as positive, whether correctly or incorrectly.

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$


3.3.3. Recall
Recall is the number of products correctly predicted as positive divided by the sum of the
products correctly predicted as positive and the products incorrectly predicted as negative.

$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$


3.3.4. F-measure
F-measure is the harmonic mean of precision and recall, with both weighted equally. The F1 measure
expresses the balance between precision and recall.

$$\text{F-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
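
The four definitions above can be computed directly from the confusion counts, as in the following
Python sketch. The function name and the example counts are hypothetical and are not taken from the
paper's experiments; returning 0.0 when a denominator is zero is a convention assumed here.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, f_measure) from the counts of
    true positives, true negatives, false positives, and false negatives."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return accuracy, precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical counts for 100 classified reviews.
    acc, prec, rec, f1 = evaluation_metrics(tp=40, tn=45, fp=5, fn=10)
    print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f-measure={f1:.4f}")
```

For these counts, accuracy is 85/100 = 0.85, precision is 40/45, recall is 40/50 = 0.8, and the
F-measure is 80/95, which lies between precision and recall as a harmonic mean should.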


Table 1. Comparison of Accuracy, Precision, Recall, and F-measure (%)

Algorithm      Accuracy   Precision   Recall   F-measure
Naive Bayes    74.76      79.54       75.86    73.75
SVM            82.85      84.45       82.13    82.38
ME             79.04      81.75       79.99    78.59
EFES+J48       97.90      98.15       97.72    97.45





Table 1 reports the accuracy, precision, recall, and F-measure of the proposed system and of the
existing methodologies; each entry is the average value of the metric over the inputs. The proposed
system is compared with the following existing classifiers: Naive Bayes (NB) [21], Support Vector
Machine (SVM) [21], and Maximum Entropy (ME) [21]. According to Table 1, the EFEC algorithm achieves
the best score on every metric for classification.
According to Figures 2 to 5, the proposed technique is evaluated on accuracy, precision, recall,
and F-measure against the NB, SVM, and ME methodologies. SVM is the closest competitor: it handles the
classification of product reviews well but is less accurate than EFEC. Relative to SVM, the EFEC
strategy improves feature extraction and classification accuracy by 15.05%, precision by 13.7%, recall
by 15.59%, and F-measure by 15.07%. Overall, the proposed EFEC algorithm performs best on all of these
metrics.
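
The reported improvements can be checked against Table 1 with a few lines of Python. The sketch below
copies the table values and subtracts the SVM row from the proposed system's row; the names `TABLE_1`
and `margin_over` are introduced here for illustration, and the row label is kept exactly as it appears
in the table.

```python
# Values copied from Table 1: (accuracy, precision, recall, F-measure) in percent.
TABLE_1 = {
    "Naive Bayes": (74.76, 79.54, 75.86, 73.75),
    "SVM": (82.85, 84.45, 82.13, 82.38),
    "ME": (79.04, 81.75, 79.99, 78.59),
    "EFES+J48": (97.90, 98.15, 97.72, 97.45),
}


def margin_over(baseline, proposed="EFES+J48"):
    """Return the per-metric difference between the proposed row and a
    baseline row, rounded to two decimals."""
    return tuple(
        round(p - b, 2) for p, b in zip(TABLE_1[proposed], TABLE_1[baseline])
    )


if __name__ == "__main__":
    for name in ("Naive Bayes", "SVM", "ME"):
        print(name, margin_over(name))
```

The differences for the SVM row are 15.05, 13.7, 15.59, and 15.07, which match the reported
improvements; the margins over Naive Bayes and Maximum Entropy are larger.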


4. CONCLUSION

This paper proposed a technique for extracting features from product reviews. Nouns and noun
phrases are extracted from every product review. The Efficient Feature Extraction and Classification
(EFEC) methodology is used to find the various features in a given product review. EFEC is also used
to recognize whether a sentence expresses a positive or negative opinion and to determine the number of
positive and negative opinions for every extracted feature, so that the number of positive and
negative opinions in a product review can be calculated. Opinion words drive the classification
accuracy. The EFEC strategy improves feature extraction and classification accuracy by 15.05%,
precision by 13.7%, recall by 15.59%, and F-measure by 15.07%. The proposed EFEC algorithm performs
best on all evaluated metrics.

In future, the work can be extended to extract trustworthy user opinions from prior user
experience in a Hadoop environment, in order to improve product quality and user interaction with the
product.